Approximate statistical alignment by iterative sampling of substitution matrices

Authors

  • Joseph L. Herman
  • Adrienn Szabó
  • István Miklós
  • Jotun Hein
Abstract

We outline a procedure for jointly sampling substitution matrices and multiple sequence alignments, according to an approximate posterior distribution, using an MCMC-based algorithm. This procedure provides an efficient and simple method by which to generate alternative alignments according to their expected accuracy, and allows appropriate parameters for substitution matrices to be selected in an automated fashion. In the cases considered here, the sampled alignments with the highest likelihood have an accuracy consistently higher than alignments generated using the standard BLOSUM62 matrix.
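The abstract does not spell out the sampler itself, so the following is only a toy sketch of one ingredient of such a scheme: a Metropolis update of a substitution-model parameter, conditional on a fixed alignment. Everything in it is an assumption made for illustration and is not taken from the paper: the one-parameter Jukes-Cantor DNA model stands in for a parametrised family of substitution matrices, the alignment is gapless and summarised by match/mismatch counts, the prior on the distance `t` is Exp(1), and the function names `jc_log_likelihood` and `sample_distance` are hypothetical. A joint sampler of the kind described would alternate a step like this with a step that resamples the alignment given the current matrix.

```python
import math
import random


def jc_log_likelihood(n_match, n_mismatch, t):
    """Log-likelihood of a gapless pairwise DNA alignment under Jukes-Cantor.

    t is the evolutionary distance (expected substitutions per site).
    """
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e   # P(b | a) when b == a
    p_diff = 0.25 - 0.25 * e   # P(b | a) for one specific b != a
    n = n_match + n_mismatch
    return (n * math.log(0.25)
            + n_match * math.log(p_same)
            + n_mismatch * math.log(p_diff))


def sample_distance(n_match, n_mismatch, n_iter=5000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for t given a fixed alignment.

    Assumed prior: Exp(1) on t, restricted to t > 1e-9 for numerical safety.
    """
    rng = random.Random(seed)
    t = 0.5
    current = jc_log_likelihood(n_match, n_mismatch, t) - t  # log posterior up to a constant
    samples = []
    for _ in range(n_iter):
        proposal = t + rng.gauss(0.0, step)  # symmetric proposal
        if proposal > 1e-9:                  # proposals outside the support are rejected
            candidate = jc_log_likelihood(n_match, n_mismatch, proposal) - proposal
            accept_prob = math.exp(min(0.0, candidate - current))
            if rng.random() < accept_prob:
                t, current = proposal, candidate
        samples.append(t)
    return samples


if __name__ == "__main__":
    draws = sample_distance(n_match=80, n_mismatch=20)
    kept = draws[1000:]  # discard burn-in
    print("posterior mean of t:", sum(kept) / len(kept))
```

The retained draws approximate the posterior over `t`, and each draw corresponds to one substitution matrix from the assumed family, which is the sense in which matrix parameters can be selected automatically rather than fixed in advance.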


Related resources

Optimizing substitution matrices by separating score distributions

MOTIVATION: Homology search is one of the most fundamental tools in Bioinformatics. Typical alignment algorithms use substitution matrices and gap costs. Thus, the improvement of substitution matrices increases accuracy of homology searches. Generally, substitution matrices are derived from aligned sequences whose relationships are known, and gap costs are determined by trial and error. To discr...


Substitution Matrices and Mutual Information Approaches to Modeling Evolution

Substitution matrices are at the heart of Bioinformatics: sequence alignment, database search, phylogenetic inference, protein family classification are based on Blosum, Pam, JTT, mtREV24 and other matrices. These matrices provide means of computing models of evolution and assessing the statistical relationships amongst sequences. This paper reports two results; first we show how Bayesian and grid...


A Transition Probability Model for Amino Acid Substitutions from Blocks

Substitution matrices have been useful for sequence alignment and protein sequence comparisons. The BLOSUM series of matrices, which had been derived from a database of alignments of protein blocks, improved the accuracy of alignments previously obtained from the PAM-type matrices estimated from only closely related sequences. Although BLOSUM matrices are scoring matrices now widely used for pr...


Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments.

The relative performances of four strategies for aligning a large number of protein sequences were assessed by referring to corresponding structural alignments of 54 independent families. Multiple sequence alignment of a family was constructed by a given method from the sequences of known structures and their homologues, and the subset consisting of the sequences of known structures was extract...


Gauss-Sidel and Successive Over Relaxation Iterative Methods for Solving System of Fuzzy Sylvester Equations

In this paper, we present Gauss-Seidel and successive over-relaxation (SOR) iterative methods for finding an approximate solution of a system of fuzzy Sylvester equations (SFSE), AX + XB = C, where A and B are two m×m crisp matrices, C is an m×m fuzzy matrix, and X is an m×m unknown matrix. Finally, the proposed iterative methods are illustrated by solving one example.



Journal:
  • CoRR

Volume: abs/1501.04986


Publication date: 2015